Localizing Natural Language in Videos
نویسندگان
چکیده
منابع مشابه
Localizing Periodicity in Time Series and Videos
Periodicity detection is a problem that has received a lot of attention, thus several important tools exist to analyse purely periodic signals. However, in many real world scenarios (time series, videos of human activities, etc) periodic signals appear in the context of non-periodic ones. In this work we propose a method that, given a time series representing a periodic signal that has a non-pe...
متن کاملLocalizing web videos using social images
While inferring the geo-locations of web images has been widely studied, there is limited work engaging in geo-location inference of web videos due to inadequate labeled samples available for training. However, such a geographical localization functionality is of great importance to help existing video sharing websites provide location-aware services, such as location-based video browsing, vide...
متن کاملLocalizing Web Videos from Heterogeneous Images
While geo-localization of web images has been widely studied, limited effort is devoted to that of web videos. Nevertheless, an accurate location inference approach specified on web videos is of fundamental importance, as it’s occupying increasing proportions in web corpus. The key challenge comes from the lack of sufficient labels for model training. In this paper, we tackle this problem from ...
متن کاملLocalizing and segmenting text in images and videos
Many images—especially those used for page design on web pages—as well as videos contain visible text. If these text occurrences could be detected, segmented, and recognized automatically, they would be a valuable source of high-level semantics for indexing and retrieval. In this paper, we propose a novel method for localizing and segmenting text in complex images and videos. Text lines are ide...
متن کاملIntegrating Language and Vision to Generate Natural Language Descriptions of Videos in the Wild
This paper integrates techniques in natural language processing and computer vision to improve recognition and description of entities and activities in real-world videos. We propose a strategy for generating textual descriptions of videos by using a factor graph to combine visual detections with language statistics. We use state-of-the-art visual recognition systems to obtain confidences on en...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Proceedings of the AAAI Conference on Artificial Intelligence
سال: 2019
ISSN: 2374-3468,2159-5399
DOI: 10.1609/aaai.v33i01.33018175